Multiple Extremal Eigenpairs of Very Large Matrices by Monte Carlo Simulation 
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We present a new Monte Carlo algorithm that allows the simultaneous determination of a few 
extremal eigenpairs of a very large matrix. It extends the power method and uses a new sampling 
method, the sewing method, that does a large state space sampling as a succession of samplings 
from a smaller state space. We illustrate the new algorithm by its determination of the two largest 
eigenvalues of the transfer matrix of a square Ising model at the critical temperature for sizes from 
16 X 16 to 48 X 48. 
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A common problem in computational physics is com- 
puting the eigenpairs of large matrices. We will present 
a new Monte Carlo algorithm that allows the simultane- 
ous determination of a few extremal eigenpairs of a very 
large matrix without the need to orthogonalize pairs of 
vectors to each other or store all the components of any 
vector. 

The new algorithm is an extension of the power (pro- 
jection) method [l|, which is the traditional starting 
point for a Monte Carlo determination of the eigenpair 
associated with the eigenvalues of largest absolute value 
Ai. While various Monte Carlo versions of the power 
method often compute this dominant eigenvalue very 
well, computing subdominant eigenvalues A2, Aa^ . . . has 
often proven much more difhcult and is much less fre- 
quently attempted [2|- Our Monte Carlo power method 
computes multiple extremal eigenpairs simultaneously 
and straightforwardly. For the particular algorithm pre- 
sented we also introduce a new sampling method, the 
sewing method, that does a large state space sampling 
as a succession of small state space samplings. Although 
our new algorithm is extendable to finding more than two 
extremal eigenpairs, here we will focus on finding just Ai 
and A2. 

Our ultimate targets are matrices so large that they 
are unassailable deterministically because no single vec- 
tor can be stored in memory. As the system size in- 
creases, finding a few extremal eigenpairs of the trans- 
fer matrix of the two-dimensional Ising model becomes 
such a problem. Its two extremal eigenvalues are also of 
significant physical interest: the logarithm of Ai is pro- 
portional to the free energy, and the ratio A2/A1 controls 
long range spin correlations near the critical point [3|, |j| . 
Onsager Q derived exact expressions the eigenvalues of 
this transfer matrix for any finite-sized system thereby 
providing a nearly unprecedented opportunity for non- 
trivially benchmarking our algorithm for exceptionally 
large systems. We comment however that there is no a 
priori restriction of our algorithm to problems in classi- 
cal statistical mechanics. It is also applicable to transfer 
matrices of quantum origin and to ground and low lying 



excited states of many Hamiltonian matrices. In addi- 
tion it has application to more widely diverse problems 
as the nuclear critically problem [5|. We also note that 
our extension of the power method is not limited to a 
Monte Carlo implementation [6|]. 

The transfer matrix A{a, a') of an rn x m Ising model 
with periodic boundary conditions in one direction and 
open boundary conditions in the other is a 2™ x 2™ matrix 
whose elements are [3| 



A {a, a') 
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with v — J/ksT, J is the exchange constant, ks be- 
ing Boltzmann's constant, and T equal to the temper- 
ature. The Ising spin variable /ifc has the value of ±1, 
Hm+i = Ml, and the symbol a = (^i, ^2, • ■ • ,Mm) de- 
notes a configuration of Ising spins. (There are 2™ pos- 
sible configurations .) Numerically, we represented a a 
by the first m bits of integers ranging from to 2™ — 1. 
We comment that all the elements of A{a, a') are greater 
than zero so the matrix is maximally dense, and because 
of the one open boundary, it is also asymmetric. From 
the Pcrron-Frobcnius Theorem [7[ we have that a dom- 
inant eigenvalue that is real and positive. Further, all 
components of the corresponding eigenstate are real and 
have the same sign. 

The power method [Ij for some real-valued M x M 
matrix A, not necessarily symmetric, is an iterative pro- 
cedure, started with some ip, normalized in a convenient 
but otherwise relatively arbitrary, manner, that cycles 
the two steps 



(j)^ All; 



(2) 



until some convergence criterion is met. If (Ai,'0i) are 
eigenpairs of A, then starting with some 



M 



■0 = X! ^a'^a 



a=l 



and specifying |Ai| > IA2I > IA3I > ••■ > \Xn\, we find 
after n iterations that 



A"V = A^ 



M 



^iV'i + y^ w, 



a=2 VAl 



(3) 



Accordingly, if) -^ ipi/WipiW and \\(f>\\ ^ Ai as n — > cx). 

Finding more than one eigenpair by the power method 
requires initiahzing the method with more than one start- 
ing point [l|, |8|. For methods of which we are aware, 
these starting points need to be orthogonal, and this or- 
thogonality needs to be maintatined, at least periodically, 
throughout the iteration. This is much more difficult to 
do in a Monte Carlo procedure than in a detcrminisitic 
one 0. 

In developing a Monte Carlo algorithm to estimate two 
extremal eigenvalues, we will exploit several observations 
of Booth [Sl, llO|. He noted that for any eigenpair (X,^) 
and for each non-zero component of the eigenvector, the 
eigenvalue equation Aip = Xip can be rewritten as 



EA^,t 
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and that similar equations can also be written for any 
number of groupings of components, 



A = 



ieRi j JG-R2 j 



E ^» 

JG-Ri 



E V'. 

i£R2 



E E^yV'j 
ieRL j 

E V'. 

jG-Rl 



(5) 

where the Ri are rules for different groupings. Here, we 
will exploit the fact that any two groupings, say 1 and 2, 
imply 



51 ^* 51 H^y^J = H ^* H H^yV'i 



(6) 
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This directly follows from ^. 

As do standard procedures for finding for just the two 
extremal eigenvalues, we also use two normalized, start- 
ing points V' = Eq ^aV'a and ip" = Y.^'^ but they 
need not necessarily be orthogonal [a, M, lly|- At each 
step of the power method, we apply A to them individ- 
ually; however, to prevent both from projecting to the 
same dominant eigenfunction, we adjust at each step the 
relationship between their sum to direct one to the dom- 
inant state and the other to the next dominant one. We 
do this in the following way: we start the iteration with 
-0 = -0' + 777/1". If at the n*'' step, ip' and tp" have iterated 
to -0' and ip" , then at the (n -I- 1)*'* step, we use (O and 
® 
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to obtain 
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(8) 



The algorithm thus is to apply A repeatedly to ■0' and 
-0". If two real solutions t/i and 772 of ^ exist, then 
update via 



(9) 





^' <- AiP' + rnAiP" 




V-" ^ AiP' + r]2Aip" 


otherwise via 






1^' ^ A^' 




0" ^ A^p" 



(10) 



After the ?7's become real, rji guides further iterations to 
(Ai,0i); ?72, to (A2,'02)- The eigenvalues are estimated 
from 
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where rji and r?2 generate the largest and next largest 
eigenvalue estimates. A justification of this procedure is 
given in [6|]. 

The Monte Carlo method is used to estimate the result 
of the repeated matrix- vector multiplication. In the basis 
defining the matrix elements of A, we write 

i 

and call the amplitudes Lij[ and w" weights even though 
they are not necessarily all positive nor are the sums of 
their absolute values unity. We assume that the elements 
of the M X M matrix A are easily generated on-the-fly 
as opposed to being stored. Next, we imagine we have 
N particles distributed over the M basis states and in- 
terpret Aij as the weight of particles arriving in state \i) 
on iteration n -f 1 per unit weight of a particle in state 
\j) on iteration n and will regard the action of A on a ?/; 
as causing a particle to jump from some \j) to some |i), 
carrying its current weight ujj , modified by Aij , to state 
\i). To do this, we define the total weight leaving state 
Ij) as 



Wi 



Ea 



(12) 



and the transition probability from \j) to \i) as 



(13) 



Instead of always (i.e., with probability 1) moving weight 
Aij from state |j) to state \i), we will instead sample a 
|z) from Tij and multiply the transferred weight by Wj 

For many Monte Carlo simulations, as is the case 
for the transfer matrix of the Ising model, the parti- 
cle weights defining the eigenvector associated with the 
largest eigenvalue can be made all positive. The sec- 
ond eigenfunction however must be represented by some 
particles of negative weight and some particles of pos- 
itive weight. For some jumps these negative and posi- 
tive weights must at least partially cancel to maintain 
a correct estimation of the second eigenfunction. When 
iV <C M, as is typical, this cancellation does not occur 
often enough in a Monte Carlo simulation without proper 
design. 

There are several ways to design the cancellation [9|. 
For our Ising simulations, we promoted this cancellation 
by sorting the particles into state order (a state is rep- 
resented by the bits of an integer) at the end of each 
iteration. Particles 1 and 2 are then sampled together 
according to the Arnow et al. scheme ll| , then particles 
3 and 4, and so forth. An ordered list means there are 
(typically) many nearby states \i) accessible from both 
particles £ and £ + 1 with nontrivial transition probabili- 
ties. 

As the iteration progresses, the absolute value of the 
weights of some particles becomes very large, and those 
of some others, very small. As standard for Monte Carlo 
methods with weighted particles, we stochastically elim- 
inated particles with weights of small magnitude and 
stochastically split those with large magnitudes. To do 
this we used a procedure called the comb [12| . 

The steps of the algorithm are: First, we initialize the 
weights of two vectors. For the Ising simulation, we se- 
lected Lu'i uniformly and randomly over the interval (0,1) 
and the uj'/ uniformly and randomly over the interval (- 
1,1). Then, for a fixed number of times we iterate. For 
each iteration we execute the jump procedure for each 
particle, place the particle list in state order, effect can- 
cellations, estimate the eigenvalues from ([TT1) . update i/;' 
and tp" , and then comb. 

If M is sufficiently small so we can store all components 
of our vectors, sampling from the cumulative probability 
Ci = X]fc=o '^kj works well. If the number of states gets 
too large (e.g., to > 12), then Ci cannot be sampled 
directly because it cannot fit in the computer's memory. 
In this case, we could just randomly pick from any state 
\j) any state \i) with probability 1/M instead of always 
picking a state \i) (i.e., with probability 1). The problem 
with this approach is that the Aij can have immense 
variation so that this simple sampling scheme is unlikely 
to work well as a Monte Carlo method. This situation 
is especially true for the Ising problem. A large part of 



such variations however can be removed by sampling the 
new state in stages and then sewing the stages together. 

To explain our sewing procedure, we will first assume 
that we can write any state \i) in our basis as a direct 
product of the states in a smaller basis, |z) = \i2) |ii). 
Instead of transferring weight Wj from state \j) to state 
\i) with probability Tij, we will use the a^ that would 
apply to the smaller set of states and then make an ap- 
propriate weight correction. For the Ising model, a^ is 
the transfer matrix of a smaller lattice size. 

For each smaller set of states, we rewrite the analogous 
transition probability from state |j) to state \i) as 

Uj = aij/wj, (14) 

and the analogous weight multiplier as 



E 



Ofej 



(15) 



We thus will sample \ii) and |z2) from the probability 
function 



t{i2,J2)tiii,ji) 



(16) 



The weight correction Cij , necessary to preserve the ex- 
pected weight transfer from state |j) to \i), satisfies 



A,j == Qjt{i2,J2)t{ii,ji) 



Thus 



Cij — Wj^ Wj2 



A,, 



a(«i,ii)a(«2,J2) 



(17) 



(18) 



The sewing method generalizes easily. For k sets of 
states, (fTT]) and (fT5]) become 



Aij — Cij J^J^ t(ln,Jn) 



n=l 



with 



Cij — Aij 1 1 



n=l ^V"rn]7i) 



(19) 



(20) 



For the Ising problem, we took for state |ii) first m/k 
bits of the integer representing |i); for |«2), the second 
set; etc. The weight correction becomes 



Cy = exp(i^Dj) Jl 



(21) 



where apart form a factor oiv, Di is the energy difference 
between calculating with the bits together and the bits 
separately. It is straightforward to calcualte. 

Using this sewing algorithm for the sampling of states, 
we computed the first and second eigenvalues for m = 1& 



TABLE I; Our Monte Carlo power method estimates of Ai and A2 and their statistical errors for variously sized square Ising 
models. Each estimate was based on 20 independent simulations. Also given are the values computed via Onsager's exact 
results Hi. 



Matrix Size 



Ai (Onsager) 



Ai 



A2 (Onsager) 



A2 



2^^ X 2' 
232 ^ 23 

2*» X 2" 



fi 



2.93297 X 10' 
8.39316 X 10^2 
2.41504 X 10^^ 



2.93307 ± 0.00008 x 10** 
8.39311 ± 0.00049 x lO^^ 
2.41522 ±0.00019 x lO" 



2.79225 X 10'' 
8.18959 X 10^3 
2.37584 X 10^^ 



2.79482 ± 0.00010 x 10" 
8.18807 ±0.00061 x lO^^ 
2.37481 ± 0.00054 x 10^® 



to TO = 48 Ising models by sewing 6 sets of 8 bits. We 
note that 2"'^ « 2.8 x 10^''. The results are shown in Ta- 
bic m To compute averages and standard errors for each 
size, we executed twenty independent Monte Carlo runs, 
each with 1 million particles per iteration (5 million for 
TO = 48), 500 iterations per run and used only the sec- 
ond half of the iterations in each run for the estimation 
process, v = 0.4406867935097715 0,13], the value at the 
critical temperature. Presented are 3 systems sizes. For 
each we give the values of Ai and A2 predicted from On- 
sager's expression and from our enhanced power method. 
We see that our Monte Carlo produced eigenvalues agree 
very well with Onsager's predictions. For our eigenvalue 
estimates, i?i consisted of the states for which more than 
half of its TO bits were O's and R2 consisted of the states 
for which more than half of its m bits were I's. More re- 
sults and algorithmic detals will be given elsewhere [13j . 

We anticipate the sewing algorithm being applica- 
ble to other many-body problems defined on a lattice. 
Likely, these applications will require more sophisticated 
programming than for the Ising model. The "sewing" 
method for the Ising model worked well as high as 
TO. = 60, sewing together 6 sets of 10 bits. For to, > 60, 
our computer codes would need significant modification 
to implement a more flexible scalable procedure for repre- 
senting a state configuration requiring more than a single 
computer word. We do not know how large an m can be 
accommodated with a better computer program. 

The transfer matrix of the Ising model is real, positive, 
asymmetric, and dense. How is our algorithm changed if 
a matrix lacks one or more of these properties? For sim- 
ple test cases, we have successfully constructed determin- 
istic procedures for matrices whose elements are complex 
valued. Also, in this context, we have had success for 
real asymmetric matrices whose eigenvalues are complex 
valued. Devising Monte Carlo algorithms for real, sym- 
metric, sparse matrices has however received more of our 
attention 1J|. To find the eigenvalue of smallest size, if 
it not the one with the largest absolute value, one simply 
uses a shifted matrix, A —^ A — al. 

In closing, we believe that our new algorithm is accu- 
rate, easy to implement, and applicable to many other 
problems. Wider use of the algorithm will define more 
crisply its strengths and limitations than is possible by 
just the present application. The intent of the present 



application was benchmarking and not studying the scal- 
ing of the eigenvalues of the transfer matrix of the two- 
dimensional Ising model. Both deterministic and Monte 
Carlo power methods have been used for such studies. 
Deterministic methods jl5| have computed Ai and A2, 
while Monte Carlo methods [l6|, just Ai. The system 
sizes were considerably smaller (to < 25 deterministi- 
cally and to, < 21 stochastically) than the largest size 
(to, — 48) presented here. This size should not be the 
largest accessible by our methods. All our calculations 
were done on a single processor. 
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